Streaming Embeddings with Slack

Authors

  • Christiane Lammersen
  • Anastasios Sidiropoulos
  • Christian Sohler
Abstract

We study the problem of computing low-distortion embeddings in the streaming model. We present streaming algorithms that, given an n-point metric space M, compute an embedding of M into an n-point metric space M′ that preserves a (1−σ)-fraction of the distances with small distortion (σ is called the slack). Our algorithms use space polylogarithmic in n and the spread of the metric. Within such space limitations, it is impossible to store the embedding explicitly. We bypass this obstacle by computing a compact representation of M′, without storing the actual bijection from M into M′.
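To make the notion of slack concrete, here is a minimal Python sketch that checks, by brute force, what fraction of pairwise distances a given mapping preserves. It illustrates the definition only and is not the paper's streaming algorithm: it stores the mapping explicitly and examines all pairs, which is exactly what the space bound above rules out. The function names, the toy line metric, and the convention that a pair counts as preserved when its distance does not shrink and grows by at most a factor `distortion` are all assumptions made for this illustration.

```python
from itertools import combinations


def preserved_fraction(points, d_src, d_dst, mapping, distortion):
    """Fraction of point pairs whose distance is preserved by `mapping`.

    Assumed convention (not taken from the paper): a pair (u, v) is
    preserved if d_src(u, v) <= d_dst(f(u), f(v)) <= distortion * d_src(u, v).
    """
    pairs = list(combinations(points, 2))
    if not pairs:
        return 1.0  # no pairs, so nothing can be violated
    preserved = sum(
        1
        for u, v in pairs
        if d_src(u, v) <= d_dst(mapping[u], mapping[v]) <= distortion * d_src(u, v)
    )
    return preserved / len(pairs)


def has_slack(points, d_src, d_dst, mapping, distortion, sigma):
    """True if at least a (1 - sigma)-fraction of the pairs is preserved."""
    return preserved_fraction(points, d_src, d_dst, mapping, distortion) >= 1.0 - sigma


def line_metric(a, b):
    """Toy metric for the example: distance between two points on a line."""
    return abs(a - b)


# Hypothetical example: four points on a line, with the outlier 10 pulled in to 3.
points = [0, 1, 2, 10]
mapping = {0: 0, 1: 1, 2: 2, 10: 3}

# The three pairs among {0, 1, 2} keep their distances exactly; the three
# pairs involving the outlier are contracted, so half of the pairs survive.
print(preserved_fraction(points, line_metric, line_metric, mapping, distortion=1))  # 0.5
```

In this toy instance the mapping has distortion 1 with slack σ = 0.5, but not with slack σ = 0.25.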

Similar Resources

Spanners with Slack

Given a metric (V, d), a spanner is a sparse graph whose shortest-path metric approximates the distance d to within a small multiplicative distortion. In this paper, we study the problem of spanners with slack: e.g., can we find sparse spanners where we are allowed to incur an arbitrarily large distortion on a small constant fraction of the distances, but are then required to incur only a cons...

ESTEEM: A Novel Framework for Qualitatively Evaluating and Visualizing Spatiotemporal Embeddings in Social Media

Analyzing and visualizing large amounts of social media communications and contrasting short-term conversation changes over time and geolocations is extremely important for commercial and government applications. Earlier approaches for large-scale text stream summarization used dynamic topic models and trending words. Instead, we rely on text embeddings – low-dimensional word representations in ...

Embedding, Distance Estimation and Object Location in Networks

Concurrent with numerous theoretical results on metric embeddings, a growing body of research in the networking community has studied the distance matrix defined by node-to-node latencies in the Internet, resulting in a number of recent approaches that approximately embed this distance matrix into low-dimensional Euclidean space. A fundamental distinction between the theoretical approaches to e...

Distributed Non-Parametric Representations for Vital Filtering: UW at TREC KBA 2014

Identifying documents that contain timely and vital information for an entity of interest, a task known as vital filtering, has become increasingly important with the availability of large document collections. To efficiently filter such large text corpora in a streaming manner, we need to compactly represent previously observed entity contexts, and quickly estimate whether a new document conta...

Online Learning of Interpretable Word Embeddings

Word embeddings encode semantic meanings of words into low-dimensional word vectors. In most word embeddings, one cannot interpret the meanings of specific dimensions of those word vectors. Nonnegative matrix factorization (NMF) has been proposed to learn interpretable word embeddings via non-negative constraints. However, NMF methods suffer from scale and memory issues because they have to mainta...

Publication date: 2009